DOCUMENT RESUME

ED 024 930                                              AL 001 582

By- von Glasersfeld, Ernst; Pisani, Pier Paolo
The Multistore System: MP-2.
Georgia Inst. for Research, Athens.
Pub Date Nov 68
Note- 72p.
EDRS Price MF-$0.50 HC-$3.70
Descriptors- *Computational Linguistics, Computer Programs, English, Form Classes (Languages), Kernel
Sentences, Linguistic Patterns, Machine Translation, Phrase Structure, *Programming, Semantics, *Sentence
Structure, *Structural Analysis, Structural Grammar, *Syntax
Identifiers- *Correlational Grammar, Parsing

The second version of the Multistore Sentence Analysis System, implemented on
an IBM 360/65, uses a correlational grammar to parse English sentences and
displays the parsings as hierarchical syntactic structures comparable to tree
diagrams. Since correlational syntax comprises much that is usually considered
semantic information, the system demonstrates ways and means of resolving certain
types of ambiguity that are frequent obstacles to univocal sentence analysis.
Particular emphasis is given to the "significant address" method of programming which
was developed to speed up the procedure (processing times, at present, are 0.5-1.5
seconds for sentences up to 16 words). By structuring an area of the central core in
such a way that the individual location of bytes becomes significant, the shifting of
information is avoided; the use of binary masks further simplifies the many operations
of comparison required by the procedure. Samples of print-out illustrate some salient
features of the system. (Author/MK)












THE MULTISTORE SYSTEM

MP-2

Ernst von Glasersfeld
Pier Paolo Pisani



U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE 
OFFICE OF EDUCATION 



THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE
PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION
POSITION OR POLICY.

























GRANT AFOSR 1319-67

MP-2

November, 1968







Scientific Progress Report 



GEORGIA INSTITUTE FOR RESEARCH 

711 C & S Bank Bldg.
Athens, Ga. 30601



ABSTRACT 



The report describes procedure and machine program of
the second version of the Multistore Sentence Analysis
System implemented on an IBM 360/65. Using a correlational
grammar (described in previous reports) the system parses
English sentences and displays the parsings as hierarchical
syntactic structures comparable to tree-diagrams. Since
correlational syntax comprises much that is usually
considered semantic information, the system demonstrates ways
and means of resolving certain types of ambiguity that are
frequent obstacles to univocal sentence analysis.

Particular emphasis is given to the 'significant address'
method of programming which was developed to speed up the
procedure (processing times, at present, are 0.5-1.5 sec.
for sentences up to 16 words). By structuring an area of
the central core in such a way that the individual location
of bytes becomes significant, the shifting of information
is avoided; the use of binary masks further simplifies the
many operations of comparison required by the procedure.

Samples of print-out illustrate some salient features
of the system.

In its general conception the new version of the Multistore
program is based on MP-I, which was described in a previous
report of the group (ILRS T-10, January 1965). The details
of the procedure, however, have undergone considerable
modification. On the one hand, this was necessary because
MP-I had been written for use on a GE 425 computer, while
the machine we have been using since our transfer to the United
States is an IBM 360/65 (University of Georgia Computer Center),
whose technical characteristics made necessary a rather
far-reaching reorganisation of the procedure; on the other hand,
as the program had to be rewritten in any case, we took this
opportunity to incorporate in it some of the improvements and
new ideas that had been developed during the period of
experimentation with MP-I.

Description and explanations - from various angles - of
the Correlational Grammar underlying this application of the
Multistore system can be found in previous reports (cf.
Bibliography), and here we limit our exposition to a very brief
outline of those dispositional aspects of the grammar that
are indispensable for an adequate understanding of MP-II.

In principle, the Multistore system can be used as a parser
with any kind of predictive grammar that supplies the items
of a given vocabulary with exhaustive and univocal indications
as to their syntactic combinability in sentences of a natural
language. Most grammars classify vocabulary items according
to their general syntactic behaviour (which leads to relatively
few but crowded classes, e.g. nouns, verbs, adjectives, etc.)
and then proceed to subdivide according to the specific or
"exceptional" behaviour of certain items or groups of items.
One might call this the botanist's approach; as with trees or
flowers, it is eminently useful with the word items of a
natural language - provided that the principal purpose of the
effort is the description of these items. But if the purpose
is the interpretation of sentences, i.e. of combinations of
items, then a classification's usefulness and efficiency
depends on how accurately it explicates and displays the
individual combinatorial behaviour of the items involved.

Correlational grammar was designed specifically for this 
second purpose. It deviates from traditional grammar in that 
it characterises the word-items (i.e. single words and phrases) 
exclusively in functional terms, and not according to their 
phonological or morphological aspects. 

This characterisation in functional terms is achieved, on
the one hand, by a minute and rigorous discrimination of
syntactic functions (called 'Correlators') and, on the other, by
assembling each individual word-item's characterisation in the
form of a string of indices (Ic's), each of which indicates
the item's specific possibility of functioning as one term
(either first or second 'Correlatum') of one particular
syntactic combination.

In fact the individual combinatorial behaviour of a
word-item is, in many instances, determined by characteristics which
would not be regarded as 'syntactic' in the traditional
acceptation of that word. This, of course, depends on how one
defines the term 'syntactic'. In the context of correlational
grammar, a characteristic is called 'syntactic' when it
determines a word-item's eligibility as correlatum of a specific
correlator; and it is called 'semantic' when it determines the
compatibility of that word-item with another word-item within
a given correlation (*).

The Ic-string of a word can be considered its grammatical
classification; but whereas in traditional grammar, when a given
word is classified as 'verb', this implies that the word
can be used as second term in subject-verb constructions, in
correlational grammar there is no such generic term for the
word, but instead a more or less numerous group of Ic's in
the word's Ic-string, which specifies exactly the types of
subject-verb correlations the word can enter into.

* cf. Jehane Burns, Bibl. N° 9.

Once a large vocabulary is thoroughly classified by
Ic-assignation, it will obviously be possible to group the
vocabulary's items according to similarities in their Ic-strings
and thus to arrive once again at syntactic word classes. But
since our project was undertaken as a feasibility study (to
explore the applicability of the correlational method to the
analysis of English sentences), we have, so far, had neither
the time nor the need to prepare a vocabulary large enough to
serve as basis for a distributional examination of syntactic
characteristics.

The suitability of the correlational grammar for automatic
sentence analysis depended - in our view - on the answers to
three questions:

1) Can a correlational grammar account for all the syntactic
   structures found in ordinary English sentences and
   can it satisfactorily recognise and interpret the structure
   of sentences when it is used in a recognition procedure
   or parser?

2) Can such a parser be improved by the incorporation of
   semantic data?

3) Is it possible to program an automatic correlational
   parser so that it will yield the analysis of an average
   English sentence in a reasonably short processing time -
   i.e. within a few seconds rather than minutes?

Question (1) was, in principle, answered by MP-I. Although
output from that program showed that the grammar with which it
was working was not sufficiently differentiated in certain
areas, it clearly demonstrated that the correlational system
could yield an efficient recognition procedure and, owing to
its essential open-endedness, was capable of any desired
degree of refinement (without basic changes in its structure).

Refinement of the grammar has been going on continuously
throughout the effort, and the operational grammar of today,
consequently, is much more efficient and reliable than the one
implemented in MP-I. This process of refinement - which often
requires painstaking studies of specific types of syntactic
structure - has not yet reached an even level of sophistication
in the entire gamut of structures possible in the English
language. This is due partly to the inevitable delay in the
machine implementation of the linguist's latest advances
(rewriting of rules, re-punching of cards, etc.); and partly it
is due to the fact that much of what, originally, had been
expected to be resolvable only by the introduction of semantic
data, has turned out to be within the reach of correlational
syntax. This is so in the case of certain prepositional
relations (cf. the treatment of 'pseudo-ambiguities' outlined in
"An Approach to the Semantics of Prepositions", Bibl. N° 5),
and instances of the elimination of pseudo-ambiguities in the
use of the preposition "by" can be seen in the examples of
output (Appendix I-c). Other instances of correlational syntax
handling problems which previously had been considered
'semantic' are the resolution of pseudo-ambiguities in the area of
predicative adjectives (*) (e.g. "John is easy to please", "John
is eager to please", "John is likely to go", and "John is kind
to go") and in the area of phrase governing verbs (**) (e.g.
"he works the land to live", "he promised us to come", "he
forced us to come", etc.).

*  cf. Some Adjective Classes Derived from Correlational
   Grammar, Bibl. N° 7.
** A paper on this specific subject is in preparation.

Given this extension of syntax at the expense of semantics,
our approach to the problems of semantics (question 2, above)
has been somewhat modified. We have become convinced that, at
the present stage in the development of our parser, it is
important to exhaust the possibilities of syntactic ambiguity
resolution before we introduce any system of disambiguation
based on semantic factors. This now seems advisable, not
because we posit an operational precedence of syntax over
semantics (such a precedence is certainly not observable in the
analysis procedure employed by the human sentence interpreter),
but because a small research group cannot possibly deal with
both areas simultaneously; moreover, a special study
investigating the addition of semantic control mechanisms to the
Multistore system is being carried out under a collateral research
project (cf. Bibl. N° 9, 10).


On the basis of MP-I, the third question (see above) could
be answered affirmatively, but with a reservation. The
processing times for short sentences were, indeed, in the range of
a few seconds; but the system was so drastically limited as
to vocabulary, sentence length, and space for grammar
improvement, that it was legitimate to doubt whether the processing
speed could be kept at a reasonably high level once the system's
scope was increased to realistic proportions.


With the reorganisation of the reclassification routines
(the most complex and slowest part of MP-I) this doubt has
been obviated. Reclassification in MP-II, thanks to the
superposition on the Multistore area (cf. 6.00-6.52, below), has
become the fastest part of the program, and the effect of sentence
length on the processing time is therefore approaching its
minimum.










General Description and Outline of the Procedure



The Multistore Parser accepts English sentences
consisting of words that are contained in the system's
vocabulary. In the vocabulary each word is characterised
by strings of indices (Ic's) which indicate its possibilities
of combining with other words or phrases to form
syntactic structures (Correlations). Each Ic specifies
one syntactic function by means of which the item bearing
that Ic can be correlated to another item, thus forming
a correlation; the Ic also specifies the item's place
(Correlational Function) in the indicated correlation.

The parsing of an input sentence is effected by matching
the complementary Ic's of its words and - once words
have been combined - correlations. Two Ic's are considered
matching when they indicate the same correlator but
different and complementary correlational functions.
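
The matching criterion just stated can be put as a short test. The following Python sketch is an illustration only: the names `Ic` and `ics_match` are invented here, and explicit correlators, which involve a third function, are left out.

```python
from typing import NamedTuple


class Ic(NamedTuple):
    """A simplified correlation index (hypothetical structure)."""
    correlator: int  # number of the syntactic function, e.g. 4010
    cf: int          # correlational function: 1 = first term, 2 = second term


def ics_match(left: Ic, right: Ic) -> bool:
    """Two Ic's match when they name the same correlator and their
    correlational functions are different and complementary (1 with 2)."""
    same_correlator = left.correlator == right.correlator
    complementary = {left.cf, right.cf} == {1, 2}
    return same_correlator and complementary
```

With one Ic per pair of words this is trivial; the cost the Multistore layout is meant to avoid arises because every Ic of every item would otherwise have to be compared against every Ic of every other item.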

The present version of the system works with approximately
300 syntactic functions, and the individual word
items, therefore, have long strings of Ic's - averaging
about 30-40. The number of matching operations (considering
not only the Ic's of the single words but also those
of the word combinations possible within a sentence) is,
consequently, very high.

The Multistore procedure was devised to reduce the
number of operations and to make them as fast as possible.
It does this, on the one hand, by spreading a temporal
sequence over a spatial area, thus making steps
contemporaneous; and, on the other, by giving each Ic a fixed
position in that area (significant address), so that the
matching operations can be carried out without the shifting
of information. Although a computer's memory is usually
considered to be linear, addresses can be arranged so
as to represent any kind of area. The Multistore can best
be visualised as a rectangular area with horizontal lines
and vertical columns. Each column represents one specific
Ic; the columns are grouped in eight sections, six for the
different correlator types (cf. 1.23), one for recognition
indices (cf. 1.25) and one, the first on the left, for the
specification of the item that occupies a given line (cf.
3.17 and Fig. 3). Each line represents an element of the
sentence, either word or word combination.

As the sentence to be analysed is input, the first word
occupies the first line of the Multistore and its Ic's are
recorded (by setting ON certain bits) in the bytes that
constitute the intersections of this first line with the
columns representing the relevant Ic's. The correlational
functions of the Ic's are represented by the configuration
of the bits that are set ON within the Ic's byte; this
portion of the byte then constitutes a 'marker' for the
particular correlation index read in the word's Ic-string.
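
A minimal sketch of the 'significant address' idea, written in Python purely for illustration (the report's program ran on an IBM 360/65). The class name, the method names, and the bit values chosen for the markers are all assumptions; only the layout principle, one byte per line/column intersection in a linear store, comes from the text.

```python
CF1_BIT = 0b01  # assumed bit for "first correlatum" marker
CF2_BIT = 0b10  # assumed bit for "second correlatum" marker


class Multistore:
    """A rectangular area laid over a linear block of bytes."""

    def __init__(self, n_lines: int, n_columns: int) -> None:
        self.n_lines = n_lines
        self.n_columns = n_columns
        self.core = bytearray(n_lines * n_columns)  # linear memory

    def address(self, line: int, column: int) -> int:
        # The position itself identifies item (line) and Ic (column),
        # so nothing has to be stored or shifted to say which Ic this is.
        return line * self.n_columns + column

    def set_marker(self, line: int, column: int, bits: int) -> None:
        self.core[self.address(line, column)] |= bits

    def has_marker(self, line: int, column: int, bits: int) -> bool:
        # A binary mask test replaces the comparison of index numbers.
        return (self.core[self.address(line, column)] & bits) == bits
```

Recording an Ic of the first word then amounts to one call such as `store.set_marker(0, column, CF1_BIT)`, where `column` is the fixed column of that Ic.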

When the second word is input, it occupies the second
line of the Multistore. While its Ic's are being recorded,
each Ic whose function indicates the word as a possible
right-hand item in a correlation made with the preceding item
triggers a search of the column (into which its marker is
being inserted) to see whether the complementary function
was marked for a preceding item. If such a marker is found
in a contiguous position (cf. 4.03), it means that a
correlation is possible (subject to the checks explained in
4.00), and it is immediately recorded as a 'Product' on
the next free Multistore line, that is to say, in the part
of the line reserved for the specification of items. It
should be noted that the product is "automatically"
specified by the place of its constituents: the right-hand piece
is the item whose Ic-marker was being inserted; the
left-hand piece is the item represented by the line on which
the complementary function was spotted; the correlator is
indicated by the Ic-number of the column in which the
combination was made. (Note that in correlations with normal
word order the left-hand item is the first correlatum,
the right-hand item the second correlatum; in correlations
reflecting an inversion of the word order in the sentence,
the roles of 1st and 2nd correlatum are inverted, but the
procedure of correlating them remains the same; cf. 1.23
and 4.02).

In this way the matching operation, instead of involving
the scanning of long strings of indices, is reduced
to a simple binary check; and the same binary check yields,
whenever a correlation is actually found to be possible,
the data required to characterise the product.
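
The column search described above can be sketched as follows. This is a simplified illustration, not the program's routine: the marker bits and the function name are assumed, the area is a plain list of lists, and the contiguity test (4.03) and the further checks (4.00) are deliberately omitted.

```python
CF1, CF2 = 0b01, 0b10  # assumed marker bits for 1st and 2nd correlatum


def insert_second_term(area, line, column):
    """Mark `line` as a possible 2nd correlatum in `column`, then look
    at the earlier lines of the same column for a 1st-correlatum marker.

    Every hit yields a candidate product (left line, right line, column):
    the three values that characterise the product come straight from
    the positions involved, with no further look-up.
    """
    area[line][column] |= CF2
    products = []
    for earlier in range(line):
        if area[earlier][column] & CF1:  # the "simple binary check"
            products.append((earlier, line, column))
    return products
```

For example, if line 0 carries a CF1 marker in column 2, inserting a CF2 marker for line 1 in the same column returns `[(0, 1, 2)]`.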

Regardless of whether or not a product results from
the scanning of the column, the procedure goes on to the
next Ic until it reaches the end of the string of the
word-item in hand.

When the last Ic of the item has been dealt with, the
Reclassification procedure sets in. Reclassification is
the process of assigning correlation indices to products,
i.e. the process of supplying a word combination with those
and only those Ic's that reflect its correlational
possibilities with regard to other words or word combinations.
It is the most complex part of the system and it is here
that MP-2 shows the most important conceptual advance in
comparison to MP-1.

The linguistic aspects of reclassification and the
general principles underlying the formulation of the rules
which govern the reclassification of products correlated
by the individual correlators have been discussed in a
previous report (ILRS T-14, Section VI); a list of the
principal types of rule used in the system will be found
under 6.20 ff. below.

From the point of view of the procedure each
correlator has an individual set of Reclassification Rules - and
this set we call Reclassification List. Some of these
Rules are unconditional, in the sense that every product
made by the specific correlator will receive the string
of Ic's they assign; others are conditioned, i.e. they
assign a string of Ic's only if, say, the first
correlatum of the product has a certain characteristic.

Many of these Rules figure in more than one List.

Since it would be wasteful of both storage space and
operational time to keep voluminous tables of Lists and
Rules outside the computer's work area, a method was
devised to incorporate the relevant data in the Multistore
area.

As the Multistore area is structured in columns, every 
one of which is dedicated to one correlator, each indi- 
vidual correlator's column is now used to record that cor- 
relator's reclassification List; and the lines of the 
Multistore area are used to record the Rules in such a 
way that each Ic they assign is indicated by another kind 
of marker at the intersection of that line with the column
dedicated to that Ic (cf. 3.17-3.19, and Fig. 3, 7).

The reclassification procedure begins with the first
product recorded in the left section of the Multistore
area during the insertion and combination cycle that has
just ended. The correlator responsible for this product
determines the column in which the relevant
reclassification List will be found. This column is then scanned for
Rule markers (i.e. a certain configuration of bits, cf.
6.03) which indicate that the line on which the marker
is found contains the details of a Rule to be applied to
the product in hand.

Whenever a Rule marker is found, the scanning of the
column is interrupted and the line is examined from left
to right. In the first section of the line there is the
specification of the Rule, indicating the correlational
function of the Ic's to be assigned by the Rule and the
conditions of their assignation. If the product satisfies
the conditions, the main part of the line is scanned,
i.e. the part which crosses the Ic-columns of the
Multistore area, and a sign (second type of marker) is
found in those columns that represent an Ic to be
assigned by the Rule. In these columns, then, an Ic-marker is
inserted at the intersection with the line of the
product that is being reclassified, and this insertion has
the same effect, and is followed by the same steps, as
the insertion of an Ic-marker from the string of an
input word (a reclassified product, thus, is treated in
exactly the same way as an input word).

The operational path that starts with the identification
of an Ic in the line of a reclassification Rule
(which is being examined to determine the Ic's that are
to be assigned to a newly made product) may thus lead
to the creation of a new product; and this is why, in
the program, combination routines and reclassification
routines are closely interwoven.

At the end of every such 'detour', however, the
scanning of the Rule's line continues. When the end of the
line is reached, the procedure returns to the column of
the List which governs the reclassification of the
product in hand, identifies the next Rule marker, and
follows the path entailed by the conditions and signs
encountered in the line that contains the indicated Rule.

And so it goes on, until there are no more new products
recorded in the left-hand section of the Multistore area,
i.e. no products that have not been reclassified.
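
The interplay of reclassification and combination can be pictured as a work-list loop. The Python sketch below is only a schematic reading of the paragraphs above: Lists and Rules are held in dictionaries rather than in Multistore columns and lines, `try_combine` is a hypothetical callback standing in for the combination routine, and the exact order in which pending products are taken up is an assumption.

```python
from collections import deque


def reclassify_all(new_products, lists, rules, try_combine):
    """Reclassify products until none is left unreclassified.

    new_products: iterable of (name, correlator) pairs
    lists:        correlator -> rule ids (that correlator's List)
    rules:        rule id -> (condition or None, Ic's assigned)
    try_combine:  callable(product, ic) -> new products created by
                  inserting that Ic marker (the 'detour')
    Returns a dict: product name -> Ic's assigned to it.
    """
    pending = deque(new_products)
    assigned = {}
    while pending:
        product = pending.popleft()
        name, correlator = product
        assigned[name] = []
        for rule_id in lists.get(correlator, []):
            condition, ics = rules[rule_id]
            if condition is not None and not condition(product):
                continue  # conditioned Rule whose condition fails
            for ic in ics:
                assigned[name].append(ic)
                # Inserting the Ic marker may create further products,
                # which must be reclassified in their turn.
                pending.extend(try_combine(product, ic))
    return assigned
```

The loop ends exactly when no unreclassified product remains, which is the stopping condition stated in the text.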

Only at this point does the next item enter into the
combination routine; this new item can be either a
separate sense of the same word, or the next word of the
input sentence.

When the last word has been dealt with, output begins:
those recorded products that contain all words of the
sentence are printed out and their correlational
concatenation, i.e. their syntactic structure as expressed in
terms of correlations, is graphically displayed. This
display, which indicates the words, the correlators
that connect them, and the hierarchy of correlations,
is in fact a binary tree-structure and constitutes the
parsing of the sentence.
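
As a rough illustration of the output step, a product can be modelled as a nested triple (correlator, left piece, right piece), which is a binary tree. The representation and the function names below are assumptions made for this sketch; the bracketed string merely stands in for the graphic display of the actual print-outs, and the label "X" in the usage note is a made-up correlator.

```python
def words(node):
    """Return the words covered by a product, in sentence order."""
    if isinstance(node, str):
        return [node]
    _correlator, left, right = node
    return words(left) + words(right)


def covers_sentence(node, sentence):
    """Only products containing all words of the sentence are output."""
    return words(node) == list(sentence)


def bracket(node):
    """Render the hierarchy of correlations as nested brackets."""
    if isinstance(node, str):
        return node
    correlator, left, right = node
    return f"({bracket(left)} {correlator} {bracket(right)})"
```

For instance, `bracket(("X", ("4010", "see", "London"), "now"))` gives `((see 4010 London) X now)`, while the inner product alone does not cover the three-word sentence and would not be printed.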

I

General Data Flow



1.00 The permanent data base of the Multistore Parser
consists of vocabulary and grammar. The words contained in
the vocabulary are represented by punched cards and in a
disc file. The grammar, on the other hand, is not
physically represented in any one place; parts of it are
represented by the strings of correlation indices of the
vocabulary items, parts by the Reclassification Lists and
Rules, and parts are built into the various combinatorial
modes of the correlation procedure.

1.01 The general structure of the system can be set out
as follows:

A - Permanent Data
        Vocabulary                  (1.10-1.26)
        Idiomatic Phrases           (1.30-1.31)
        Reclassification Lists      (1.40-1.44)
        Reclassification Rules      (1.50-1.54)

B - Input Data                      (2.00-2.22)
C - Multistore Program              (3.00-7.50)
D - Output                          (8.00-8.31)

(Note: The decimal numbers refer to the relevant paragraphs
of the text.)

For an illustration of the general data flow, see Fig. 1.

Fig. 1

A1 - cards type 30 and 31, input to establish vocabulary;
A2 - disc file containing the vocabulary;
A3 - cards type 10, input of reclassification lists;
A4 - cards type 20, input of reclassification rules;
B1 - cards type 40, input sentence (words);
     cards types 50, 51, 60, punctuation marks;
B2 - dictionary look-up;
C  - Multistore program;
D  - output, print-outs types a, b, c.

A - Permanent Data

1.10 The vocabulary contains all word-items the system can
handle; consequently the system can analyse sentences that
are composed exclusively of words contained in this
vocabulary. (Since the system was designed for experimentation
and research, no provision was made to enable the procedure
to signal, by-pass, or otherwise deal with words that are
not found in its vocabulary.)

1.11 The vocabulary is stored on a disc file (Data Cell).
The data it contains are recorded on the disc file by means
of punched cards that are input previous to the start of an
analysis operation.

1.12 The vocabulary entry for one word-item consists of data
which are written into the disc file by means of two
different types of punched card: S-cards (formerly head-cards)
and Ic-cards.

1.13 An S-card (punched card type 30) contains the following
data:

- Vocabulary Number (cf. 1.14)
- S-Number (cf. 1.15)
- G and T code (cf. 1.16)
- the word in letters (cf. 1.17)

1.14 The Vocabulary Numbers reflect the alphabetic order of
words in the vocabulary, but they are not a continuous
sequence since, in order to allow for the insertion of new
words (expansion of the vocabulary), a regular interval of
20 numbers was left between the words chosen for the initial
vocabulary; some of these numbers have since been occupied
by subsequently added words.
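
The numbering scheme of 1.14 can be illustrated with a small sketch. Only the interval of 20 is taken from the text; the starting number, the midpoint policy for placing a new word, and the function names are assumptions introduced here.

```python
import bisect

STEP = 20  # interval left between words of the initial vocabulary


def initial_numbers(words):
    """Number the initial vocabulary in alphabetic order, STEP apart."""
    return {w: (i + 1) * STEP for i, w in enumerate(sorted(words))}


def add_word(numbers, word):
    """Give a new word a free number between its alphabetic neighbours
    (assumed policy: the midpoint of the gap)."""
    ordered = sorted(numbers)  # alphabetic order of the existing words
    pos = bisect.bisect_left(ordered, word)
    low = numbers[ordered[pos - 1]] if pos > 0 else 0
    high = numbers[ordered[pos]] if pos < len(ordered) else low + 2 * STEP
    new = (low + high) // 2
    if new in (low, high):
        raise ValueError("no free number left in this gap")
    numbers[word] = new
    return new
```

Because a new word always receives a number strictly between those of its neighbours, numeric order continues to reflect alphabetic order after insertions.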

1.15 The individual S-Number of an item is a two-digit code
number which distinguishes the different senses (meanings)
of the word, regardless whether the difference is semantic
or syntactic. Thus there will be, for instance, three
vocabulary entries (with different S-numbers) for "can":

        can   (S = 01)   modal auxiliary
        can   (S = 02)   verb (= to pack in cans)
        can   (S = 03)   noun (= container)

(A semantic split of the modal auxiliary "can" - ability
and possibility - would be necessary for translation
into certain languages but has not yet been implemented
in the vocabulary.)

The code numbers (01, 02, etc.) do as yet not indicate
specific types of sense or function; they merely
distinguish homographs from one another. In a future version of
the program we plan to integrate this S-code with a
comprehensive grammatical and semantic code of 8 digits.

1.16 The G- and T-code of the item corresponding to a given
S-number is a 6-digit code number divided into two
areas: a 2-digit area G, and a 4-digit area T.

At present only the area G is being used. It contains a
rough, temporary grammatical classification, introduced
provisionally for the purpose of testing reclassification
routines conditioned by data in that specific coding area.

The definitive classifications to be coded in the six
digits of the G & T area will be developed on the basis of
research that is still under way.

1.17 Each S-card bears the alphanumeric representation of
the word it refers to. The maximal number of letters for
one word is 12 (longer words have to be truncated or
abbreviated).

1.20 For each S-card there is a set of Ic-cards (punched
cards type 31) which bears the Ic-strings of the item
corresponding to the S-card. The Ic's represent the item's
correlational possibilities and are divided into separate
strings according to the correlator type and the
correlator function they indicate.

1.21 A Correlation Index, or Ic, consists of 6 digits and
contains three different data:

1) the correlator number (4 digits), i.e. the number of
   one individual correlator (syntactic function);
   e.g. 4010, which is a verb//object relation; or
   0245, which is the relation of 'spatial proximity'
   expressed by the preposition "by".

2) the correlator type (1 digit); the correlator type
   indicates certain basic characteristics of the
   correlations for which the correlator is responsible
   (cf. 1.23).

3) the correlational function CF (1 digit), i.e. the
   indication of the item's place in a correlation,
   which can be:

   CF = 1 - 1st term, or correlatum, of the correlation;
   CF = 2 - 2nd term, or correlatum, of the correlation;
   CF = 3 - the correlator.
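
A sketch of how such a 6-digit index might be split into its three data. The text gives the field widths (4 + 1 + 1) but not, in this passage, the order of the fields or the digit used for each correlator type; the order assumed below (correlator number, type, CF) and the names `Ic` and `parse_ic` are therefore hypothetical.

```python
from typing import NamedTuple


class Ic(NamedTuple):
    correlator: str  # 4-digit correlator number, e.g. "4010"
    ctype: str       # 1-digit correlator type (coding not specified here)
    cf: int          # correlational function: 1, 2 or 3


def parse_ic(code: str) -> Ic:
    """Split a 6-digit Ic, assuming the field order number/type/CF."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError("an Ic consists of exactly 6 digits")
    cf = int(code[5])
    if cf not in (1, 2, 3):
        raise ValueError("CF must be 1, 2 or 3")
    return Ic(correlator=code[:4], ctype=code[4], cf=cf)
```

The correlator number is kept as a string so that leading zeros, as in 0245, are preserved.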

1.22 In the sentence "see London by night", for instance,
the operative correlational functions of the items are:

"see":        CF1 of correlator 4010 (verb//object);
"London":     CF2 of correlator 4010;
        the correlator (4010 CF3) responsible for the
        correlation "see London" is implicit;

"see London": CF1 of correlator 0241 (activity//
              specification of lighting, cf. Appendix I-c);
"by":         CF3 of correlator 0241;
"night":      CF2 of correlator 0241;
        the correlator (0241 CF3) responsible for the
        correlation "(see London) by (night)" is explicit,
        that is to say, it is expressed by a specific
        word, namely "by".

1.23 The coding of correlator types is as follows:

Type N:  implicit correlator; the correlated items
         (correlata), in the word order of the sentence,
         have functions:
                CF1 + CF2 respectively;

Type M:  implicit correlator; the correlata's functions
         are inverted in the word order of the sentence:
                CF2 + CF1

Type V:  implicit correlator with obligatory comma;
         the correlata's functions follow the word
         order and the correlata are separated by a comma:
                CF1 + , + CF2

Types E and G: explicit correlators; the correlata's
         functions, in the word order of the sentence,
         have the sequence:
                CF1 + CF3 + CF2.
         Note: Types E and G differ only in the partial
         combinations that lead to the correlation; in
         type E the item with CF1 is combined with the
         item bearing CF3 to form a 'Semiproduct' (cf.
         4.11 ff) previous to combination with CF2;
         in type G the semiproduct is formed by CF3
         and CF2 previous to combination with CF1; this
         second way of combining the three elements was
         introduced for experimental reasons, but is
         not operative with the present data base.

Type F:  explicit correlator; the correlata's functions,
         in the word order of the sentence, have the
         sequence:
                CF3 + CF2 + CF1;
         Note: the semiproduct CF3 + CF2 is formed
         previous to combination with CF1.
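
The six correlator types can be tabulated as data. The following Python sketch restates 1.23; the table and function names are invented for illustration, the semiproduct for type F is read as CF3 + CF2, and treating correlator 0241 of the "see London by night" example as type E in the usage note is an assumption.

```python
# Word-order sequence of correlational functions for each type (1.23).
SEQUENCE = {
    "N": ("CF1", "CF2"),
    "M": ("CF2", "CF1"),
    "V": ("CF1", ",", "CF2"),
    "E": ("CF1", "CF3", "CF2"),
    "G": ("CF1", "CF3", "CF2"),
    "F": ("CF3", "CF2", "CF1"),
}

# Pair that is combined first into a semiproduct (explicit types only).
SEMIPRODUCT = {
    "E": ("CF1", "CF3"),
    "G": ("CF3", "CF2"),
    "F": ("CF3", "CF2"),
}


def surface_order(ctype, cf1, cf2, cf3=None):
    """Arrange the pieces of a correlation in sentence word order."""
    if ctype in SEMIPRODUCT and cf3 is None:
        raise ValueError("explicit correlator types need a correlator word")
    parts = {"CF1": cf1, "CF2": cf2, "CF3": cf3, ",": ","}
    return [parts[slot] for slot in SEQUENCE[ctype]]
```

With these tables, `surface_order("E", "see London", "night", "by")` yields the pieces in the order see London, by, night.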

1.24 The combinations of correlator type and correlational
function that characterise different Ic-strings (cf. 1.20)
are as follows:

        N1, M1, V1, E1, G1, F1,
        N2, M2, V2, E2, G2, F2,
                    E3, G3, F3.

1.25 Recognition Indices (cf. 3.14), although they are not
Ic's, are included in the string N1. They are
distinguishable from Ic's by the fact that their numbers are beyond
the range of correlator numbers and therefore refer to a
different section of the Multistore area (cf. 6.20 ff).

1.26 At present the average number of Ic's pertaining to one
S-item is 23; the average number of Ic's on one Ic-card is
5.65. The word-item with the fewest Ic's has 3 Ic's and 3
Ic-cards; the item with the most has 136 Ic's and 22
Ic-cards.

1.30 Idiomatic Phrases, i.e. fixed word combinations whose
meaning as a whole is not equivalent to the meaning of any
syntactic structure composed of the same individual words,
are treated like word-items in the vocabulary. Like single
words, they are represented by an S-card and a set of
Ic-cards.

1.31 A special routine, which runs parallel to the correlation
procedure, spots them in the input sentence and brings
them into play as vocabulary items the moment the last word
of the idiomatic phrase has appeared in input.

1.40 Reclassification Lists

For each correlator-number there is a set of one or
more cards (depending on the amount of data concerning the
particular correlation) which contain the code numbers of
the Reclassification Rules relevant to products made by
that correlator.

1.41 Since the reclassification data (Lists and Rules)
represent the system's syntax and are, therefore, subject to
frequent correction during the experimental stage of the
work, they are not permanently stored on a disc or similar
storage device, but are kept in the form of punched cards
which are input together with the analysis program.

1.42 The List-cards (punched cards type 10) contain the
following data:

- a label, which is the Ic whose products are reclassified
  by means of the reclassification Rules indicated
  in that List;

- a set of Rule-numbers which indicate the relevant
  Rules.

1.43 In the system, the data contained in the Lists are
inserted into their specific positions in the Multistore area.
The Ic that labels a given List also determines the Ic-column
into which that List's data are inserted; the Rule-numbers
determine the lines (of that same Ic-column) where
the individual Rule-markers are inserted (cf. 6.02 ff).

1.44 There are as many Lists as there are correlators in the
system. At present the system works with 266 correlators
and the same number of operative Lists. The number of Rules
indicated on one List varies between 1 and 50, with an
average of approximately 20.

1.50 Reclassification Rules

The reclassification Rules that have to be applied to
the products of one specific correlator are indicated by
their code numbers on the reclassification List (cf. 1.42)
of that Ic; the specifications of the Rules, however, are
fed into the system separately, on Rule-cards. This division
of data was found to be economical because many
Rules apply to the products of more than one Ic.

1.51 Reclassification Rules are kept in the form of punched
cards which are input together with the analysis program
(cf. 1.41).

1.52 For each Rule there is a set of one or more punched
cards (depending on the amount of data involved in the
particular Rule). These Rule-cards (punched cards type 20)
contain:

- a label, which is the code number of the Rule;

- a string of one or more Ic's which specify the
  assignations to the product;

- indication of the correlator type and the correlational
  function to which the Ic's of the string refer;

- the conditions of assignation (cf. 6.20 ff).

1.53 In the system, the data contained in the Rule-cards are
inserted into their specific positions in the lines of the
Multistore area (cf. 6.02 ff).

1.54 At present the system works with 270 reclassification
Rules, which assign from 1 to 26 Ic's (average approximately
6).

B-Input Data

2.00 The input to the system consists of English sentences.
The system will process any sentence, provided that:

1) it consists of words that are contained in the
   operational vocabulary;

2) it does not exceed the length of 16 words;

3) it does not contain punctuation marks other than comma,
   full stop, and question mark.
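
The three conditions of 2.00 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name, the token list, and the vocabulary set are hypothetical and are not part of the Multistore program, which receives its input as punched cards.

```python
ALLOWED_PUNCTUATION = {",", ".", "?"}   # comma, full stop, question mark (2.00, condition 3)
MAX_WORDS = 16                          # provisional limit (2.00, condition 2)


def is_word(token):
    """Treat any token containing a letter or digit as a word (assumption)."""
    return any(ch.isalnum() for ch in token)


def check_input_sentence(tokens, vocabulary):
    """Return the reasons for rejecting a sentence; an empty list means acceptable."""
    problems = []
    words = [t for t in tokens if is_word(t)]
    marks = [t for t in tokens if not is_word(t)]

    unknown = [w for w in words if w.lower() not in vocabulary]
    if unknown:
        problems.append("words not in vocabulary: " + ", ".join(unknown))
    if len(words) > MAX_WORDS:
        problems.append(f"{len(words)} words exceeds the limit of {MAX_WORDS}")
    bad = [m for m in marks if m not in ALLOWED_PUNCTUATION]
    if bad:
        problems.append("unsupported punctuation: " + " ".join(bad))
    return problems
```

Called with a hypothetical vocabulary {"john", "works", "hard"}, the token list ["John", "works", "hard", "."] yields no problems, whereas a sentence containing an unlisted word or a semicolon is reported.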

2.01 The upper limit of sentence length was provisionally
fixed at 16 words. The display of a correlational sentence
structure with 16 word terminals can just be fitted on one
page of print-out without cutting out any of the data
required for immediate visual checking. Since the purpose of
the program is above all experimental, and since 16 words
are easily enough to exemplify all possible types of
syntactic construction, this seemed a reasonable limitation -
all the more as the sentence-length distribution in
scientific writings has its absolute peak between the
sentence-lengths of 16 and 19 words (*). The analysis program
would require only slight changes if one wanted it to process
longer sentences.

2.10 The sentences to be analysed (input sentences) are
manually composed of Input-cards, one for each word of the
sentence and one for each punctuation mark contained in it.

2.11 Input-cards representing words (punched cards type 40)
contain the following data:

- vocabulary number of the word;

- the word in letters;

- the S-numbers of the different senses of the word;

- and their individual G-codes.

2.12 The input-cards, type 40, serve to call up from the
vocabulary the data (cf. 1.12-1.22) concerning the words
of the input sentence.

2.13 Owing to the fact that the Data Cell units of the com- 
puter we have been using were for a long time inaccessible, 
we have been running the Multistore program without them; 
that is to say, the vocabulary is kept in the form of 
punched cards, and the sentences that serve as input for 
the analysis procedure are composed manually. 

2.14 The word-cards relevant to the input sentence are,
therefore, input after the sentence and the input-cards
call up the data from these. The switch-over to the disc
file, however, is fully prepared.

2.20 Input-cards representing punctuation marks are
independent cards (punched cards types 50, 51, 60) and must be
inserted during the composition of the input sentence and
into those places between the words where the punctuation
marks actually occur.

(*) cf. Computational Analysis of Present-Day American
English, by H. Kucera and W. Nelson Francis, Brown University
Press, 1967.

2.21 The punctuation marks foreseen by the present program
are:

- full stop (punched cards, type 50);

- question mark (punched cards, type 51);

- comma (punched cards, type 60).

2.22 The punctuation cards do not refer to items in the
vocabulary; they operate as switching signals and have a
direct effect on the relevant subroutines of the correlation
procedure.

C-Multistore Program 

3.00 The Multistore program implements the correlational
parser, that is to say, it analyses the input sentence as
to the correlational structures contained and it displays
in its output the 'complete' structures, i.e. those
structures that embrace all the words of the sentence.

3.01 The analysis is based partly on the information
concerning the correlational possibilities of the single word
items of the sentence (Ic-strings, cf. 1.20 ff) and partly
on the rules of word-order in English sentences
(incorporated in the 'Modes' of combination, cf. 4.21-4.23).

3.02 The procedure can be divided into a number of different
sub-procedures:

- input of the sentence;

- combination procedure (producing correlations of
  words and word-combinations);

- reclassification procedure (assigning correlation
  indices to word-combinations);

- restraints (preventing or eliminating certain combinations
  which are generally possible, but, in the particular
  case, impossible owing to some feature of the
  given sentence);

- output (of the correlational structures found to be
  present in the sentence).

However, where the program is concerned, the different
subroutines representing these sub-procedures interact
continually and cannot be neatly separated. To follow the
description of the program, it will be essential to keep in
mind the 'General Description of the Procedure' (pp. 6-11).



Input

3.10 The input sentence is composed of punched cards type 40,
representing the words of the sentence (cf. 2.11 ff), and
types 50, 51, and 60, representing punctuation marks (cf.
2.20-2.22).

3.11 The machine reads the whole sequence of input cards and
records the data supplied by them in a work area (Sentence
Store). During this operation the words are given their
Input Numbers, i.e. sequential numbers from 1 to 16, which
reflect the word-order of the input sentence.

3.12 In the record of the first word the first S-number (cf.
1.15) is then read, and the data corresponding to that item
are transferred to the Multistore area.



3.13 Each Ic (cf. 1.21) consists of 6 characters:

    4 characters, xxxx = correlator number;

    1 character,  y    = correlator type;

    1 character,  z    = correlational function.

The character y determines the specific section of a
transformation table which indicates the operational code
of the Ic-number characterised by y.

Fig. 2

3.14 The transformation table is divided into 7 sections:
six of them correspond to the different combinatorial 'modes'
of the six correlator types (cf. 1.23); the seventh section
is for Recognition Indices Ri (cf. 6.22), which play no
part in the making of correlations but serve to specify
items for the reclassification procedure.

The different sections of the table thus reflect the
division into sections of the Multistore area; that is to say,
those sections of Multistore columns that represent Ic's
(the 8th and left-most Multistore section (cf. 3.17) needs
no counterpart in the transformation table).
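
The six-character layout of an Ic given in 3.13 can be sketched in a few lines of Python. This is an illustration only: the report states the character counts but not the actual character codes, so the use of a letter for y and a digit for z, as well as the names Ic and parse_ic, are assumptions.

```python
from typing import NamedTuple

CORRELATOR_TYPES = set("NMVEGF")   # the six correlator types of 1.23


class Ic(NamedTuple):
    number: int     # xxxx, the correlator number
    ctype: str      # y, the correlator type (letter code is an assumption)
    function: int   # z, the correlational function CF (digit code is an assumption)


def parse_ic(text):
    """Split a 6-character Ic into correlator number, type and function."""
    if len(text) != 6:
        raise ValueError("an Ic has exactly 6 characters")
    number, ctype, function = text[:4], text[4], text[5]
    if not number.isdigit() or ctype not in CORRELATOR_TYPES or function not in "1234":
        raise ValueError(f"malformed Ic: {text!r}")
    return Ic(int(number), ctype, int(function))
```

In this sketch the field ctype plays the part of the character y, i.e. it would select the section of the transformation table in which the correlator number is looked up; Recognition Indices, which occupy the seventh section, are left out.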

3.15 The table contains a total of 496 Ic-numbers divided
into correlator types as follows:

    E: 119        M:  62
    F:  93        N: 183
    G:   2        V:   7
                  Ri: 30

3.16 The characters xxxx are transformed, by means of the
transformation table, into relative addresses of the form
aaa, ranging from 032 through 527, each of which corresponds
to the specific Ic on the lines of the Multistore area, each
line extending from position 000 to position 527.

3.17 The Multistore area comprises 330 lines which occupy
a total of 174,240 bytes. Of these bytes 163,680
are directly addressable by means of Ic-numbers, whereas
10,560 bytes (i.e. 32 bytes in every line) are not, because
they contain the processing definitions of the items
represented by the lines (left section of the Multistore area).
(See Fig. 3, page 26.)
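
The figures in 3.16 and 3.17 follow from the line layout and can be checked with a short calculation. The sketch below assumes, purely for illustration, that the lines are stored one after another and are counted from 0; the report does not state this, and the function name byte_offset is hypothetical.

```python
LINES = 330
LINE_LENGTH = 528                        # positions 000 through 527
HEAD_LENGTH = 32                         # positions 000 through 031 (head of line)
IC_COLUMNS = LINE_LENGTH - HEAD_LENGTH   # positions 032 through 527

TOTAL_BYTES = LINES * LINE_LENGTH        # 174,240
ADDRESSABLE_BYTES = LINES * IC_COLUMNS   # 163,680
HEAD_BYTES = LINES * HEAD_LENGTH         # 10,560


def byte_offset(line, position):
    """Offset of a byte from the start of the area (row-major layout assumed)."""
    if not (0 <= line < LINES and 0 <= position < LINE_LENGTH):
        raise IndexError("outside the Multistore area")
    return line * LINE_LENGTH + position
```

Under this assumed layout, the byte for relative address 032 on the second line (line 1) lies at offset 560, and the same relative address on any other line is reached by adding a multiple of 528.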

3.18 The character z, representing the correlational function
CF of the Ic, determines the configuration of bits
0, 1, 2, 3, 4, within the byte addressed by the Ic.

    Bit 0 set ON represents CF4
    Bit 1 set ON represents CF3
    Bit 2 set ON represents CF2
    Bit 3 set ON represents CF1



3.19 Bit 4 is set ON whenever one of the bits 0, 1, 2,
or 3 is set ON. This additional bit can, subsequently, be
set OFF to block access to bits 0, 1, 2, 3 of the specific
byte. This is one realisation of the 'blocks' that
implement correlation restraints (cf. 7.11); in this case
an individual Ic-marker of an item can be temporarily
excluded from either combination or reclassification routines
during the analysis of a sentence.



    bit No. 0:  CF4
    bit No. 1:  CF3
    bit No. 2:  CF2
    bit No. 3:  CF1
    bit No. 4:  'block' (access to bits 0, 1, 2, 3)
    bits No. 5, 6, 7:  reclassification coding (cf. 6.00 ff.)
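
The use of bits 0 to 4 described in 3.18 and 3.19 can be sketched with binary masks. The sketch assumes the IBM System/360 convention in which bit 0 is the left-most (most significant) bit of a byte; the function names are hypothetical and the code is an illustration, not the original routine.

```python
def bit(n):
    """Mask for bit n of a byte, with bit 0 as the left-most bit (assumed convention)."""
    return 0x80 >> n


CF_BIT = {4: bit(0), 3: bit(1), 2: bit(2), 1: bit(3)}   # table of 3.18
ACCESS = bit(4)                                          # the additional bit of 3.19
CF_MASK = bit(0) | bit(1) | bit(2) | bit(3)


def insert_marker(byte, cf):
    """Set the CF bit and, with it, the access bit."""
    return byte | CF_BIT[cf] | ACCESS


def block(byte):
    """Set the access bit OFF; the CF bits stay in place but become inaccessible."""
    return byte & ~ACCESS & 0xFF


def unblock(byte):
    """Restore access, but only if some CF marker is present in the byte."""
    return byte | ACCESS if byte & CF_MASK else byte


def has_marker(byte, cf):
    """A marker counts only if its CF bit is ON and access is not blocked."""
    return bool(byte & ACCESS) and bool(byte & CF_BIT[cf])
```

Because blocking touches only the access bit, a temporarily excluded marker can be reinstated later without consulting the vocabulary data again.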



3.20 The transformation table (cf. 3.13-3.19) thus serves
to prepare the data taken from the word entry in the
vocabulary for insertion into the operational Multistore.

Insertion follows the sequence of the words and their
S-numbers in the sentence store.

3.21 The first line of the Multistore contains the data
corresponding to the first S-number (sense) of the first word;
the second line those corresponding to the second S-number
of that word, and so on for all its discriminated senses.



3.22 Correlations can connect only different words (or
senses of different words), but not different senses of
one and the same word. The correlation procedure, therefore,
begins only when the data representing the first S-number
of the second word are being inserted.

From this point on, correlations may result from the
procedure. They are called Products, and each one of them
occupies one line of the Multistore area; the sequence of
occupation, therefore, will be determined by the results
of the correlation procedure as well as by the sequence of
words in the input sentence.

3.23 The first lines of the Multistore are always occupied
by the S-items of the first word; they constitute 'Level'
number 1. Then comes the first S-item of the second word
(beginning of level number 2); if this produces a correlation
with one of the S-items of the first word, the resulting
product will occupy the next line of the Multistore.
The lines containing the S-items of the second word,
therefore, may not be consecutive because there may be product
lines between them.

3.30 When the products arising from the last S-item of a
word have occupied their lines, the level of that word is
closed and the next level begins with the insertion of the
first S-item of the next word in the input.

Correlation Procedure 

4.00 The making of a correlation is subject to three
conditions:

a) the items to be combined must be contiguous;

b) the two markers representing the two items respectively
   in one and the same Ic-column must be complementary;

c) there must be no block affecting that correlation.

4.01 In order to gain both space and speed, it was desirable
to keep the correlation routine as homogeneous as possible
for all types of correlator. For this reason the correlation
routine works in the same direction regardless of
the word-order of correlata in the sentence; it works from
right to left, whether the construction be normal or
inverted.

To distinguish between the operational order and the
syntactical order, we speak of 'right-hand' pieces and
'left-hand' pieces when we refer to the correlation routine -
and it should be remembered that these terms do not
necessarily correspond to 'first correlatum' and 'second
correlatum' respectively.

e.g. In the phrase "John must", the word "John" is
both 1st correlatum (CF1) and 'left-hand' piece of the
correlation; in the inverted form of the same correlation,
"must John", however, "John" is still the 1st correlatum,
but now it has the position of 'right-hand' piece.

4.02 In the correlation procedure all attempts to combine
one item with another start from a right-hand piece (RH),
i.e. from the later item in the word order of the sentence.
Thus, whenever an RH piece is inserted, the preceding part
of the specific Ic-column is searched for a left-hand (LH)
piece. If no LH is found, no correlation can be made in
that column. If an LH is found, other checks have to be made
to ascertain that a correlation is possible.

(Note: here and in the following, 'right-hand piece' or RH
refers to a word or word combination as represented by those
of its Ic's which indicate correlational possibilities with
the word or word combination on its left; and LH refers to
a word or word combination as represented by those of its
Ic's which indicate correlational possibilities with the
word or word combination on its right. Any word or word
combination, therefore, can be both RH and LH, depending on
which of its Ic-strings one is considering at the moment.)

4.04 The first check (cf. 4.00,a) concerns the contiguity of
the two items represented by RH and LH respectively.
Two items are contiguous if the right-most word of the
left-hand item is the word immediately preceding the
left-most word of the right-hand item.

e.g. in "John works" the two words are contiguous;

     in "John works hard" the word "John" and the
     product "works hard" are contiguous;

     in "John works hard to make a decent living" the
     product "works hard" is contiguous with the
     product "to make a decent living", and "John" is
     contiguous with the product resulting from these two.

Contiguity is checked by means of the 'Level-Indication'
assigned to each word during input (input number) and to
each product during production (cf. samples of print-out,
Appendix I-a).
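
The contiguity check of 4.04 reduces to a comparison of level numbers. The following Python sketch assumes that the level indication of every item is held as a pair (first level, last level), a single word at input position n having the pair (n, n); this representation and the function names are assumptions made for illustration.

```python
def contiguous(left_range, right_range):
    """True if the last word of the left item immediately precedes
    the first word of the right item."""
    return left_range[1] + 1 == right_range[0]


def product_range(left_range, right_range):
    """Level range of a product made from two contiguous items."""
    return (left_range[0], right_range[1])


# "John works hard to make a decent living"
john = (1, 1)
works_hard = (2, 3)
to_make_a_decent_living = (4, 8)

assert contiguous(works_hard, to_make_a_decent_living)
predicate = product_range(works_hard, to_make_a_decent_living)   # covers levels 2 to 8
assert contiguous(john, predicate)
assert not contiguous(john, to_make_a_decent_living)
```

No word of the sentence has to be inspected for this check; the two pairs of level numbers suffice.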




4.10 The second check (cf. 4.00,b) concerns the compatibility
of RH and LH. This compatibility is determined by the
sequence of correlational functions required by each
correlator type.

The required sequences are:

    Correlator type E:   E1 + E3 + E2;
    Correlator type G:   G1 + G3 + G2;
    Correlator type F:   F3 + F2 + F1;
    Correlator type M:        M2 + M1;
    Correlator type N:        N1 + N2;
    Correlator type V:        V1 + V2.

4.11 Correlator types E, G, and F have three elements (since
the correlator is here explicit, i.e. it is expressed by a
word of the sentence).

In order to keep the correlation procedure as homogeneous
as possible, these explicit correlations are made in two
steps: the first step correlates two of the three CF's to
form a Semiproduct, and the second step then correlates this
semiproduct with the third CF.

(Besides homogenising the procedure, this makes it possible
to let certain explicit correlators form adverbial or
modal phrases, if this should be desired; e.g. "he arrived
in a car" should form an explicit correlation, whereas in
"he arrived in a rage" the semiproduct "in a rage" should
be treated as adverbial to "arrived". This second possibility
has been implemented in the system; it has not yet been
used because further text analysis will be required to
establish its exact range of application.)

4.12 Semiproducts, thus, can be reclassified in two different
ways:

1) as parts of a three-piece explicit correlation - and
   in this role they receive correlational function CF4
   of the Ic that was responsible for their formation;
   this CF4 is assigned to them by an automatic rule.

2) as ordinary correlations (modal, adverbial, etc.,
   phrases), in which role they are assigned the required
   Ic's by ordinary reclassification Rules, in
   the same way as any other product of an implicit
   correlator.

In all other respects semiproducts (SP's) are treated
like other items, i.e. like words or ordinary products.

4.13 The correlation of explicit products is achieved in
the following sequences:

    Correlators type E:   E1 + E3 -> SP(E4);   SP(E4) + E2 -> P(E)

    Correlators type F:   F3 + F2 -> SP(F4);   SP(F4) + F1 -> P(F)

    Correlators type G:   G3 + G2 -> SP(G4);   G1 + SP(G4) -> P(G)

4.20 The correlation procedure is divided into three types
of mechanism, or Modes, according to the roles the right-hand
and left-hand pieces play in it.

4.21 Mode 1, i.e. the passive role played by LH-items (cf.
4.02). All Ic's representing LH-items are inserted as markers
into their specific columns of the Multistore and remain
passive, as targets, as it were, for the searches triggered
by RH-items.

This mode applies to Ic's of the following correlator
types and CF's:

    E1, G1, N1, V1; M2; F3, G3; E4, F4.

4.22 Mode 2, i.e. the active role played by RH-items in the
production of semiproducts. Ic's representing possible
RH-items of semiproducts are inserted as markers and trigger a
search (in the column into which they are inserted) for
markers of complementary LH-items.

Mode 2 applies to Ic's of the following correlator
types and CF's:

    E3 (which combines with E1),

    F2 (which combines with F3),

    G2 (which combines with G3).

4.23 Mode 3, i.e. the active role played by RH-items in the
production of full products (correlations). Ic's representing
possible RH-items of full products are inserted as
markers into their specific columns and trigger a search
(in that same column) for markers of complementary LH-items.

This mode applies to Ic's of the following correlator
types and CF's:

    E2 (which combines with E4),

    F1 (which combines with F4),

    G4 (which combines with G1),

    M1 (which combines with M2),

    N2 (which combines with N1),

    V2 (which combines with V1).
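
The pairings listed under Modes 2 and 3 amount to a small lookup table from an active right-hand marker to the passive left-hand marker it searches for. The Python sketch below reads the last pairing of Mode 3 as V2 combining with V1, in line with the sequence V1 + V2 of 4.10; the table and function names are hypothetical, and contiguity and blocks (cf. 4.00) are deliberately left out.

```python
# active RH marker -> (passive LH marker searched for, what the combination yields)
COMPLEMENT = {
    # Mode 2: production of semiproducts
    "E3": ("E1", "semiproduct"),
    "F2": ("F3", "semiproduct"),
    "G2": ("G3", "semiproduct"),
    # Mode 3: production of full products
    "E2": ("E4", "product"),
    "F1": ("F4", "product"),
    "G4": ("G1", "product"),
    "M1": ("M2", "product"),
    "N2": ("N1", "product"),
    "V2": ("V1", "product"),
}

# Mode 1: every marker that only ever serves as a target
PASSIVE = {lh for lh, _ in COMPLEMENT.values()}


def try_combine(rh_marker, markers_above):
    """Return 'semiproduct' or 'product' if the newly inserted marker finds its
    complement among the markers already present in the same Ic-column."""
    if rh_marker not in COMPLEMENT:
        return None                      # Mode 1 markers trigger no search
    lh_marker, result = COMPLEMENT[rh_marker]
    return result if lh_marker in markers_above else None
```

For an explicit correlator of type E the table is consulted twice: E3 finds E1 and yields a semiproduct, which receives E4; a later E2 then finds that E4 and yields the full product.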

4.30 The third check (cf. 4.00,c) concerns 'blocks' which
may or may not have been recorded for a specific S-item or
product during the analysis of a given input sentence.

A block prevents the item from becoming a left-hand
correlatum in the subsequent course of the analysis
procedure. There are two forms of this kind of block:

a) temporary, which can be removed at a subsequent
   point of the analysis;

b) permanent, which remains operative for the duration
   of the sentence analysis in course.

4.31 The temporary block is indicated by bit 2, the permanent
block by bit 3 of byte 5 (see Fig. 4, p. 27); an example
of such a block is given under 7.21.

Production

5.00 If the three checks (cf. 4.00) turn out positive, a
product is generated.

Since the characteristics of the items that form the
product are implicit in the locations (addresses) of these
items within the Multistore area, "generating" a product
means, in fact, no more than inserting the relevant
characteristics of its components at the head (i.e. the
left-most part) of the next free Multistore line.

Fig. 6

5.01 The data recorded at the head of the Multistore line
are:

    1) Serial number of product or semiproduct;

    2) Ic-column, origin of product;

    3) G & T code of product (this code is, at present,
       used for single words only);

    4) 1st correlatum of the product (i.e. its address);

    5) Correlator No. and type (i.e. a number representing
       the increment required to address the
       Multistore column dedicated to that correlator);

    6) 2nd correlatum of the product (i.e. its address);

    7) Level range of the product (i.e. the lower
       level-number of its first component and the
       higher level-number of its last component).
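
The seven entries of 5.01 can be pictured as one small record per product. The Python sketch below is an illustration under assumptions: the field names, the use of line numbers as addresses, and the function head_for_product are hypothetical, and the byte positions the entries occupy at the head of the line are not modelled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LineHead:
    serial: int                    # 1) serial number of product or semiproduct
    ic_column: int                 # 2) Ic-column in which the product originated
    gt_code: Optional[str]         # 3) G & T code (single words only at present)
    first_correlatum: int          # 4) address of the 1st correlatum
    correlator_increment: int      # 5) increment addressing the correlator's column
    second_correlatum: int         # 6) address of the 2nd correlatum
    level_range: Tuple[int, int]   # 7) lowest and highest level covered


def head_for_product(serial, ic_column, increment, left, right, inverted=False):
    """Build the head of a product line.

    left and right are (line, (low_level, high_level)) for the LH and RH pieces
    in word order; inverted=True marks a construction in which the RH piece is
    the 1st correlatum (cf. the example "must John" in 4.01)."""
    (left_line, left_range), (right_line, right_range) = left, right
    first, second = (right_line, left_line) if inverted else (left_line, right_line)
    return LineHead(serial, ic_column, None, first, increment, second,
                    (left_range[0], right_range[1]))
```

Nothing about the words themselves is copied into the record: the two addresses and the column suffice, since the remaining characteristics stay implicit in the locations.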

5.02 Production takes place as long as new Ic's are inserted
on the given level (provided, of course, that they find
complementary markers in their column and that the relevant
checks are positive).

5.03 The reclassification of newly-made products takes place
when there are no more Ic's to be inserted on the given
level.

Reclassification

6.00 The reclassification routine begins by reclassifying
the first product recorded on the present level.

6.01 The data that govern reclassification are contained in
the Lists and Rules (cf. 1.40-1.54).



Fig. 7 (head of lines; Ic-columns): (1) Rule specifications;
(2) Column of producing Ic; (3) Rule-markers constituting the
Ic's List; (4) Ic-markers constituting the string of the
Rule's assignations.

6.02 The reclassification List for the products made by a
specific correlator is incorporated in the Multistore column
that represents that correlator, and the reclassification
Rules are incorporated in the Multistore lines.

6.03 Since each List contains the code numbers of the Rules
that are to be applied to the products of the correlator
to which it refers, the List's representation in the specific
Multistore column consists of Rule-markers in all
those bytes that are intersections of that column with the
lines containing the listed Rules (see Fig. 7, page 35).

6.04 In reclassification the Multistore column is scanned
from the top down. When the first Rule-marker is encountered,
the scanning shifts to the head of the line in which
the marker was found; this left-most section of the line
(positions 28-31) contains the specifications of the first
reclassification Rule applicable to the product that is
being reclassified.



Fig. 8 (head of line; Ic-columns)

6.10 Byte 28 of the Rule specification contains the
incrementation value by means of which the address is computed
at which the modality (i.e. the application routine) for
the particular Rule is stored.

Byte 29 is made up in the following way:

6.11 Bit 6 - if set ON - blocks the Rule represented by the
line; this block is used for experimental purposes (for
instance if one wants to process a sentence both with and
without a specific reclassification Rule). The Rule block
is set up and removed during the analysis by internal
commands triggered by semantic or other factors.

6.12 Bit 7 discriminates those Rules which apply to
semiproducts only.

6.13 Bits 0, 1, 2, 3 indicate the type of Rule represented
by the line in question.

6.20 Reclassification Rules fall into two groups:

    Rules that assign Ic's to a given product
    unconditionally, and

    Rules that assign Ic's only if the product that
    is to be reclassified satisfies a specific
    condition.

Actual codification of Rule types:

    1100  rules that specify a string of Ic's (to be
          assigned unconditionally or conditionally);

    0100  rules that transfer Ic's from one of the
          product's correlata to the product (with or
          without exceptions);

    0011  special rules to trigger specific 'blocks';

    0001  special rules for experimentation with data that
          do not usually enter the reclassification cycle.
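
The make-up of byte 29 (bits 0 to 3 for the Rule type, bit 6 for the Rule block, bit 7 for semiproduct Rules) can be sketched as a decoding function. Two assumptions are made and should be read as such: bit 0 is taken to be the left-most bit of the byte, as on the IBM System/360, and bit 7 set ON is taken to mean "semiproducts only", which the text does not state explicitly. The names are hypothetical.

```python
RULE_TYPES = {
    0b1100: "string of Ic's",
    0b0100: "transfer from a correlatum",
    0b0011: "trigger a block",
    0b0001: "experimental",
}

BLOCKED = 0x02             # bit 6, counting bit 0 as the left-most bit
SEMIPRODUCTS_ONLY = 0x01   # bit 7 (meaning of ON is an assumption)


def decode_rule_byte(byte):
    """Decode the Rule type and the two flags carried by byte 29."""
    code = (byte >> 4) & 0x0F          # bits 0-3 form the left-hand half of the byte
    return {
        "type": RULE_TYPES.get(code, "unknown"),
        "blocked": bool(byte & BLOCKED),
        "semiproducts_only": bool(byte & SEMIPRODUCTS_ONLY),
    }
```

A byte of 1100 0010, for example, would describe a string-assigning Rule that is currently blocked.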

6.21 Unconditional Rules assign a string of one or more Ic's
to the product (P) to be reclassified.

6.22 Conditional Rules are of four types:

1) Rules which assign one or more Ic's only if the 1st
   or 2nd correlatum (as specified by the Rule) of the
   product bears a given Ic, Recognition Index, or
   G-code. This condition is given by a character in
   byte 30. (These Rules can also be negative, in
   which case assignation is made only if the specified
   correlatum does not bear a specified Ic,
   Ri, or G-code.)

2) Rules which assign one or more Ic's only if those
   Ic's are part of the string of the 1st or 2nd
   correlatum (as specified by the Rule) of the P.

3) Rules which assign to the P all Ic's found in the
   relevant string of the 1st or 2nd correlatum (as
   indicated by the Rule) of the P.

4) Rules which assign all Ic's found in the relevant
   string of the 1st or 2nd correlatum (as specified
   by the Rule) with the exception of a set of one
   or more specified Ic's. (Note that Rules of type
   3 are, in fact, Rules of type 4 with a zero
   exception set.)
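
The four types of conditional Rule can be stated compactly in terms of sets of Ic's. The following Python sketch is an illustration only: Ic-strings are modelled as sets of labels, the function names are hypothetical, and the choice between 1st and 2nd correlatum is left to the caller.

```python
def rule_type_1(assign, condition, correlatum, negative=False):
    """Assign the Rule's Ic's only if the correlatum bears the condition
    (or, for a negative Rule, only if it does not)."""
    met = condition in correlatum
    return set(assign) if met != negative else set()


def rule_type_2(assign, correlatum):
    """Assign only those of the Rule's Ic's that the correlatum itself bears."""
    return set(assign) & set(correlatum)


def rule_type_4(correlatum, exceptions=frozenset()):
    """Transfer all Ic's of the correlatum except the specified ones."""
    return set(correlatum) - set(exceptions)


def rule_type_3(correlatum):
    """Transfer all Ic's of the correlatum: type 4 with an empty exception set."""
    return rule_type_4(correlatum)
```

The last definition mirrors the remark in the text that Rules of type 3 are Rules of type 4 with a zero exception set.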

6.23 Any Rule assigns Ic's of only one specified correlator
type and CF, both of which indications are given in byte 31.
Rules of type 2, 3, and 4 transfer Ic's from a correlatum
of the P to the P itself without changing their correlator
type or CF; in Rules of type 1 the correlator type and CF
of the condition-Ic have no bearing on correlator type or CF
of the Ic's assignable to the P.

6.30 In the remaining 496 bytes of the Multistore line - each
of which bytes constitutes the intersection with one Ic-column
of the Multistore - the bits 6 and 7 (cf. Fig. 7, p. 35)
indicate whether the Ic represented by the particular column
is assigned by the Rule or not.

Thus, scanning the bytes of the line, each bit 6 found
set ON implicitly determines one Ic-number (the one
represented by the column to which the byte belongs) that is to
be assigned to the product. The correlational function of
this Ic-number, on the other hand, has already been
determined by byte 31 of the line that is being scanned.

6.31 Once an Ic to be assigned by the Rule has been spotted
(bit 6 set ON), it immediately effects the insertion of an
equivalent Ic-marker in the byte that constitutes the
intersection of that same Ic-column with the line representing
the P which is being reclassified.

Since both the spotting of the Ic-marker contained in
the Rule and the insertion of the equivalent Ic-marker in
the Ic-string of the P take place in one and the same column
(which is the column dedicated to that specific Ic), the
Ic-number itself does not have to be read, recorded, or
shifted; it remains implicit as locational characteristic
of the specific column during the entire sequence of
operations.

6.32 The end of a reclassification Rule, i.e. the last Ic to
be assigned by it, is indicated by bit 7 (in the byte
containing the last Ic-marker); if bit 7 is set ON, it
constitutes the end signal of the particular Rule.

6.40 Reclassification can be summarised as follows:

Whenever a product is made in a column X of the Multistore,
the characteristics of this product are recorded at
the head of the next free Multistore line; production then
continues until there are no more Ic's to be inserted on
that level.

Reclassification then begins with the first of the newly
recorded products.

The correlator number of the product is also the number
(address) of the column that contains the reclassification
List relevant to that product.

The column is then scanned for Rule-markers.

The lines on which Rule-markers are found contain the
relevant reclassification Rules. At the head of the line
the conditions of Ic-assignation are specified; the rest of
the line contains markers indicating the specific Ic's to
be assigned to the product.

Whenever a marker is encountered it effects the insertion
of a marker of the same Ic in the Ic-string of the
product.

This insertion is in every way equivalent to the insertion
of an Ic-marker originating from an Ic of an input word.
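
The summary above can be followed step by step in a toy model of the Multistore area. The sketch is a simplification under several assumptions: the head of each line is left out, the correlational function assigned by each Rule (byte 31) is supplied as a separate table, conditions are ignored, the bit used for a Rule-marker is hypothetical, and no new product is formed when a marker is inserted (cf. 6.41). Bit 0 is again taken as the left-most bit of a byte.

```python
ASSIGN, END = 0x02, 0x01       # bits 6 and 7 of a byte on a Rule's line
RULE_MARKER = 0x04             # hypothetical encoding of a List's Rule-marker
CF_BIT = {1: 0x10, 2: 0x20, 3: 0x40, 4: 0x80}
ACCESS = 0x08


def reclassify(grid, rule_cf, product_line, producing_column):
    """Scan the producing column for Rule-markers and copy each marked Rule's
    assignations into the product's line, column by column."""
    inserted = []
    for line, row in enumerate(grid):                # column scanned from the top down
        if not row[producing_column] & RULE_MARKER:
            continue
        for column, byte in enumerate(row):          # scan the Rule's own line
            if byte & ASSIGN:
                # the Ic-number is never read: it is implied by the column
                grid[product_line][column] |= CF_BIT[rule_cf[line]] | ACCESS
                inserted.append((column, rule_cf[line]))
            if byte & END:                           # end signal of the Rule
                break
    return inserted


# Two Rules (lines 0 and 1), one unused line, and a product on line 3
# that was made in column 2.
grid = [[0] * 5 for _ in range(4)]
grid[0][2] = RULE_MARKER            # Rule on line 0 is on the List of column 2
grid[0][1] = ASSIGN
grid[0][3] = ASSIGN | END
grid[1][4] = RULE_MARKER            # Rule on line 1 belongs to another List
grid[1][0] = ASSIGN | END
rule_cf = {0: 1, 1: 2}

result = reclassify(grid, rule_cf, product_line=3, producing_column=2)
```

Only the Rule on line 0 is applied here, since the other Rule carries no marker in column 2; the product's line ends up with markers in columns 1 and 3, in the same form in which an input word's Ic-markers would be inserted.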

6.41 As in the case of Ic-markers originating from the
Ic-string of an input word, this Ic-marker resulting from the
reclassification of a product may cause a new product to be
made (cf. 4.00 ff); if it does, the new product is recorded
at once at the head of the next free Multistore line (i.e.
is recorded before the reclassification of the original
product proceeds).

6.50 Reclassification procedure is, therefore, continually
interlinked with the correlation procedure (cf. General
Description, pp. 6-11). The reason for this is the considerable
gain in processing time that was made possible by the
systematic superposition of two essentially different
procedures - i.e. correlation and reclassification - in one and
the same area of significatively structured machine memory.

Restraints ( * ) 

7.00 Reclassification, by assigning Ic's to products, determ- 
ines the possibilities these products, or word combinations, 
have oT combining with other words. In doing this, it cha— , 
racterises the products in much the same way as the as- 
signation of Ic*s characterises single words. There is, 
however, one important difference. 

* We speak of 'Restraint' when the specific context of a
product makes it impossible for that product to corre-
late with another item in a way that would be possible
and correct in other sentences.

7.01 When we assign Ic's to a word as vocabulary item, we
necessarily have to consider all correlations into which
the word can possibly enter; but when the procedure re-
classifies a product, i.e. a word combination found to be
feasible with words of the given sentence, all that has
to be considered are the correlational possibilities this
product has within the given sentence. Experience has shown
that there are many cases where a certain correlational pos-
sibility of a product, although legitimate in theory, can
be excluded because of what comes before the product (i.e.
stands on the left of it) in the input sentence.


7.10 A simple example is this: 

In the sentence "the wine is sour", the product "wine
is sour" does not have to be correlated because the product 
"the wine" necessarily supersedes the correlational pos- 
sibilities the word "wine" has by itself. 

7.11 Restraints of this kind are implemented by 'blocking'
the relevant Ic-markers of the item. That is to say, in the
example, once the correlation "the wine" has been produced,
certain Ic's of the word "wine" are prevented from forming
correlations to the right (cf. 3.19).
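
The blocking described in 7.11 can be pictured as masking bits in an item's Ic-string. What follows is only a minimal sketch in Python, not the IBM 360 code of the system; the one-bit-per-Ic layout, the names `Item`, `block` and `active_markers`, and the sample bit values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A word or product with its Ic-markers, one bit per marker (assumed layout)."""
    name: str
    ic_markers: int      # bit i set: the item carries hypothetical Ic number i
    blocked: int = 0     # bit i set: Ic number i is blocked in this sentence

    def block(self, mask: int) -> None:
        """Block the markers selected by mask; nothing is erased or shifted."""
        self.blocked |= mask

    def active_markers(self) -> int:
        """Markers that may still form correlations."""
        return self.ic_markers & ~self.blocked


# Hypothetical Ic's of "wine"; bits 1 and 2 stand for correlations to the right.
wine = Item("wine", ic_markers=0b0111)
RIGHTWARD = 0b0110

# Once "the wine" has been produced, the rightward Ic's of the bare word are blocked.
wine.block(RIGHTWARD)
print(bin(wine.active_markers()))
```

Keeping the original markers and a separate block mask means that a block can be inspected in the print-out without the Ic-string itself being altered.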

7.20 A similar type of restraint can be formulated for cert-
ain S-items (senses) of a word which, during the analysis
of a given sentence, fits as 2nd correlatum into one of
several specific correlations.

7.21 An example is the sentence "my answers were complete":
When the word "answers" can be correlated as a plural noun
with the possessive "my", it is impossible for the verb-sense
of "answers" (3rd person singular, present indicative) to
be correlated within the same sentence; this other sense of
the word can, therefore, be eliminated from the analysis,
and this is achieved by blocking the Ic-markers in the Multi-
store line containing the relevant S-card (cf. 4.30 ff).

7.30 Another type of restraint concerns 'complete' products
(i.e. products that contain all the words of a sentence)
which, although grammatically correct, are not acceptable
as interpretations of the sentence. Such products occur
when the string of words that constitutes the input sent-
ence would give rise to a different syntactic interpreta-
tion if it were preceded or followed by something else.

7.31 A case in point are sentences containing the form
"were" which can be either indicative or subjunctive and
must, therefore, give rise to two correlational interpreta-
tions.

If a string such as "they were leaving" is the whole
sentence, however, only the indicative interpretation is
acceptable, since the subjunctive one would require "if"
or "I wish" or some such conditional expression to precede
it.

Thus, although the subjunctive construction is a cor-
rect interpretation as far as it goes, we can formulate the
rule that it can be excluded as a 'non-sentence' if it
stands alone (cf. print-out, Appendix I-d).

7.40 The last type of restraint implemented in the present 
version of the Multistore system is similar to the 'non-
sentence' exclusion, except that it affects phrases (pro-
ducts) before the final level. It is applied in cases that
are akin to the kind described under 7.21, but cannot be 
dealt with by a block because the spurious product is made 
before the desired alternative correlation has turned up in 
the procedure, i.e. before a choice has become possible. 

7.41 In a sentence such as "they accepted his apology", for
instance, it is inevitable (because necessary if the sent-
ence ended with "his") that the pronominal sense of "his"
produces a correlation in which it is the object of the verb;
and this correlation can be discarded only at a later stage,
i.e. when the next word has entered into the procedure. How-
ever, once the next word has entered and has, in fact, given
rise to a correlation with the possessive sense of "his",
the preceding verb-object correlation is definitively super-
seded and can be discarded. The program does this by blocking
the reclassification of the superseded product, thus impeding
its entering into further correlations; this retroact-
ive block is shown in the print-out by the word DISCARD.
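
The retroactive block of 7.41 can be sketched as follows. This is a simplified illustration in Python under stated assumptions, not the program's own routine: the record layout, the function name `discard_superseded` and the product numbers are hypothetical, and the rule is reduced to "discard every earlier product that used another sense of the word".

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    number: int                                   # hypothetical product number
    first: str                                    # first correlatum
    second: str                                   # second correlatum
    senses: dict = field(default_factory=dict)    # word -> S-number used here
    discarded: bool = False


def discard_superseded(products, word, kept_sense):
    """Flag products that used `word` in a sense other than `kept_sense`."""
    hit = []
    for p in products:
        sense = p.senses.get(word)
        if sense is not None and sense != kept_sense and not p.discarded:
            p.discarded = True          # shown as DISCARD in the print-out
            hit.append(p.number)
    return hit


# "they accepted his apology": "his" 01 = possessive adjective, 02 = possessive pronoun.
verb_object = Product(1, "accepted", "his", {"his": "02"})
possessive = Product(2, "his", "apology", {"his": "01"})
products = [verb_object, possessive]

# The possessive correlation has turned up, so the pronominal reading is superseded.
print(discard_superseded(products, "his", kept_sense="01"))
```

The discarded product stays in the record, which is what allows the print-out to show it with its DISCARD comment.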

7.50 The capability of blocking certain items or certain Ic's
has not yet been fully exploited; only the most obvious ex-
clusions have been formulated as rules. A large-scale examina-
tion of texts would certainly bring to light many more in-
stances of possible blocks; but, given the size of our group
(one and a half linguists), we are, unfortunately, in no po-
sition to undertake large-scale text surveys. Our main ob-
ject in including at least some of these rules in the pro-
gram was to demonstrate that it can handle restraints of
this kind and that they go a long way to reduce the product-
ion of spurious correlations.

D - Output 

0.00 The output of the present version of the parser consists 
of three parts: 

a) a graphic display of the correlational structures 
that constitute the parsing of the input sentence; 

b) a print-out of the intermediary products that led 
to the parsing; 

c) a print-out of the Ic-strings assigned to the inter- 
mediary products by reclassification. 



Parts b and c are invaluable for checking the function-
ing of the system and its grammar, but would, of course, be
cut out if the system were applied to text-analysis.



Print-out begins when the correlation procedure for one
input sentence has come to its end. The end of the correla-
tion procedure is recognised by the fact that, after the
operations of a given level, a full-stop card (cf. 2.21)
enters instead of a card representing a next word. (Note
that points belonging to abbreviations would not be repre-
sented by a full-stop card, but would be part of the relevant
vocabulary item.)

0.02 The first item printed out is the string of words con-
stituting the input sentence, showing the input numbers of
the words and the punctuation marks.

e.g. 01 WE 02 WANTED 03 TO 04 GO .
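
The layout of this word-string can be reproduced with a few lines of Python. The sketch below is an illustration only; the function name `format_word_string` and its parameters are assumptions, while the two-digit input numbers and the limit of 16 words are taken from the description of the print-out (cf. Appendix I-a).

```python
MAX_WORDS = 16  # limit of the input sentence stated in Appendix I-a


def format_word_string(words, final_mark="."):
    """Number the words from 01 and append the closing punctuation mark."""
    if len(words) > MAX_WORDS:
        raise ValueError("input sentence is limited to 16 words")
    numbered = [f"{i:02d} {w.upper()}" for i, w in enumerate(words, start=1)]
    return " ".join(numbered + [final_mark])


print(format_word_string(["we", "wanted", "to", "go"]))
# prints: 01 WE 02 WANTED 03 TO 04 GO .
```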

The word-string appears at the top of every page of
print-outs a and b.

0.10 The print-out routine then scans, from top to bottom,
the left-most section of the Multistore area, i.e. the heads
of the lines, taking into account the product lines only
(products and semiproducts).

0.11 When a product is found, it is examined for its level-
range and printed as one line of print-out b. The following
data appear in print (cf. Appendix I-a):

Product number:     P for product, S for semiproduct;
                    and four digits for the P-number.

First correlatum:   W for word, P for product;
                    four digits for input- or P-number re-
                    spectively;
                    two digits for the relevant S-number
                    of the word;
                    two digits for the G-code of the W.

Correlator:         four digits for the correlator number;
                    one digit for the correlator type.

Second correlatum:  data as for 1st correlatum.

Level-range:        input number of first word of the P;
                    input N° + 1 of last word of the P.

0.12 If a product begins at level 01 and ends with a full
stop, the word COMPLETE is added to the line in print-out b.

0.13 If the product has been blocked by the type of restraint
described under 7.41, the word DISCARD is added to the line
in print-out b.

0.14 If the product has been recognised as not acceptable
(cf. 7.30 ff), the words NON-SENTENCE are added to the line
in print-out b.

0.15 If a 'complete' product is affected by neither of these
restraints (cf. 0.13, 0.14), print-out a takes place.
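
The decisions of 0.12 to 0.15 amount to a small piece of logic, sketched here in Python. The record fields and function names are hypothetical, and we assume, without confirmation from the print-outs, that a product marked DISCARD or NON-SENTENCE still carries the word COMPLETE when it qualifies for it.

```python
from dataclasses import dataclass


@dataclass
class ProductRecord:
    number: int
    first_level: int            # input number of the first word of the product
    ends_with_full_stop: bool   # product reaches the end of the sentence
    discarded: bool = False     # retroactive block (7.41)
    non_sentence: bool = False  # rejected as a sentence interpretation


def is_complete(p: ProductRecord) -> bool:
    return p.first_level == 1 and p.ends_with_full_stop


def status_words(p: ProductRecord) -> list:
    """Comments added to the product's line in the product list."""
    words = []
    if is_complete(p):
        words.append("COMPLETE")
    if p.discarded:
        words.append("DISCARD")
    if p.non_sentence:
        words.append("NON-SENTENCE")
    return words


def gets_graphic_display(p: ProductRecord) -> bool:
    """Only 'complete' products free of both restraints are displayed."""
    return is_complete(p) and not p.discarded and not p.non_sentence


accepted = ProductRecord(7, first_level=1, ends_with_full_stop=True)
rejected = ProductRecord(9, first_level=1, ends_with_full_stop=True, non_sentence=True)
print(status_words(accepted), gets_graphic_display(accepted))
print(status_words(rejected), gets_graphic_display(rejected))
```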

0.20 The display of a 'complete' product's correlational
structure (print-out a) is composed by retracing, in the
product records, first all the second correlata involved
in the product and then all the first correlata; the graph-
ic arrangement is constructed line by line.

(A complete explanation of the data in print-outs type a
is given in Appendix I-a.)
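
Retracing a structure from the product records can be sketched recursively. The Python fragment below is an illustration under assumptions: the dictionary layout is hypothetical, the composition of the four products is our reading of the structure shown for "are books to be read?" in Appendix I-d, and the result is a bracketed string instead of the graphic arrangement built line by line by the actual routine.

```python
# Input words by input number.
WORDS = {1: "ARE", 2: "BOOKS", 3: "TO", 4: "BE", 5: "READ"}

# Product records: P-number -> (first correlatum, correlator, second correlatum).
# A correlatum is ("W", input number) or ("P", product number).
PRODUCTS = {
    5: (("W", 3), "2310N", ("W", 4)),
    6: (("P", 5), "2570N", ("W", 5)),
    8: (("W", 2), "3650N", ("P", 6)),
    9: (("W", 1), "4210N", ("P", 8)),
}


def retrace(kind, number):
    """Return the structure below a correlatum as a bracketed string."""
    if kind == "W":
        return WORDS[number][:3]   # terminals show the first three letters
    first, correlator, second = PRODUCTS[number]
    return f"[{retrace(*first)} {correlator} {retrace(*second)}]"


print(retrace("P", 9))
# prints: [ARE 4210N [BOO 3650N [[TO 2310N BE] 2570N REA]]]
```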

0.30 When print-outs a and b have been completed, the
product records are scanned once more and, whenever a pro-
duct or semiproduct is encountered, the corresponding Multi-
store line is scanned for Ic-markers (reclassification).

0.31 Each Ic-marker found is identified as to correlator
number, type and correlational function, and the full Ic's
are printed consecutively, thus displaying the reclassifica-
tion string of the product in hand (cf. Appendix I-a).

APPENDIX I-a



Graphic Display of Sentence Structure

[Annotated sample display for the input sentence 01 SHE 02 WAS 03 KIND. Callouts: (1) Input Sentence, (2) Input Numbers, (3) Product Numbers, (5) Abbreviated Input Words, (6) Word Classification Code. Terminal line: SHE  WAS B6  KIN 1A.]

1) The input sentence is limited to 16 words, which have
their pre-established numbered places (two lines of 8 places
each).

2) The input numbers reflect the words' position in the
sentence; they are printed out because in the product list
(see page I-a,2) the individual words are identifiable only
by their input numbers.

3) The product numbers reflect the sequence of production
in the course of the analysis; the first product number, in
the left top corner of the graphic display, is the number
of the displayed product.

4) The correlator number specifies the correlator re-
sponsible for the correlation indicated by the dashes on
either side of it; if the correlation is 'normal' the number
appears at the end of the line, if it is 'inverted' the num-
ber stands at the beginning.

5) At the terminals of the structure, the first three let-
ters of the relevant word are printed.

6) The word code contains a summary of grammatical and
semantic data (cf. G-code, 1.16); it serves exclusively for
the identification of the particular word's sense (S-number).

* * *

Product List

(Print-out type b) Reproduction and explanation on page I-a,3

Reclassification Print-out 

This print-out (type c) shows the strings of Ic's as-
signed to the products by the reclassification routines.

It serves the checking of reclassification rules and the
tracing of errors.

[Page I-a,3: annotated reproduction of the product list. Surviving callouts: upper level of product (input number + 1 of last word contained); punctuation mark at end of P; W = word, P = product; correlator type.]

Complete Parsing

Input Sentence:

[Input sentence and product list not recoverable.]

Graphic Display (Print-out type a):

[Graphic display not recoverable. Surviving elements: correlators 4010N, 5010N, 2310N, 2250N, 3550N and 0243E; terminals WAN F8, ME A1, TO 00, ANS VS, THE CA, LET (-, BY, SUN.]



Explanation of Correlators:

2250N   subject // past tense of verb.
3550N   framing verb* + object // infinitive;
        the object of the verb is the actor of the infinitive.
0243E   temporal limitation, post-terminal, expressed by "by".
4010N   verb // object.
5010N   definite article // noun.
2310N   "to" // supine (forming infinitive).



(Note: the sentence "he wanted the machine to answer the let-
ters by Sunday" would yield two parsings: one similar to the
above, and another in which the second correlation from the
top would be a product of correlator 3670N, in which the sub-
ject of the verb is the actor of the infinitive, which lat-
ter expresses the final cause or purpose of the subject's
activity.)

* i.e. clause-governing verb.

An Application of Prepositional Analysis

As an example of the way in which correlational grammar
can handle the relations expressed by explicit correlators
(mostly prepositions and conjunctions) a sample set of eight
relations collectively represented in English by the pre-
position "by" was introduced into the parsing system.

The eight relations and their individual correlator numb-
ers are:



0241   Specification of lighting
       e.g. "to travel by day"
            "we played by moonlight"

0243   Temporal limitation, post-terminal
       e.g. "she will be gone by Sunday"

0245   Spatial proximity
       e.g. "he sat by the fire"
            "the house by the church"

0247   Efficient agent
       e.g. "he was killed by the gangsters"

0249   Authorship
       e.g. "a book by Hemingway"
            "a sonata by Mozart"

0251   Itinerary
       e.g. "he arrived by the fields"
            "I escaped by the back door"

0253   Means of transport
       e.g. "they travelled by car"

0255   Method: activity (present participle)
       e.g. "he learnt it by watching professionals"




These, obviously, are not all the relations that can be
expressed by the English "by" (according to a preliminary
analysis, about 25 can be isolated and defined - not tak-
ing into account subdivisions based on the grammatical
character of the correlata); they are, however, among the
most frequent and are quite sufficient to demonstrate the
system's capacity to eliminate 'pseudo-ambiguities' (cf.
An Approach to the Semantics of Prepositions, 1965).

The words of the parser's vocabulary having been examin-
ed for their individual correlability within the range of
the selected eight relations, the required Ic's were assign-
ed to the vocabulary items and reclassification rules were
written to assure assignation to relevant products.

The assignation of these Ic's to words of the vocabulary 
was done intuitively and would, of course, require verifica- 
tion in a large and truly representative corpus of texts be- 
fore it could be considered definitive. Nevertheless, the 
results obtained with a tentative application are very sa- 
tisfactory. 
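
How such Ic's narrow down the readings of "by" can be illustrated with a small Python sketch. The correlator numbers and relation names are those listed above; everything else is an assumption made for illustration, in particular the sets `LEFT` and `RIGHT`, which stand in for Ic assignments that are not reproduced in this report.

```python
BY_RELATIONS = {
    "0241": "Specification of lighting",
    "0243": "Temporal limitation, post-terminal",
    "0245": "Spatial proximity",
    "0247": "Efficient agent",
    "0249": "Authorship",
    "0251": "Itinerary",
    "0253": "Means of transport",
    "0255": "Method: activity (present participle)",
}

# Hypothetical Ic's: relations an item admits as first correlatum (left of "by")
# and as second correlatum (right of "by").
LEFT = {
    "was killed": {"0243", "0245", "0247"},
    "read his book": {"0241", "0243", "0245"},
}
RIGHT = {
    "the river": {"0245", "0247", "0251"},
    "Sunday": {"0243"},
}


def by_readings(left: str, right: str) -> list:
    """Relations admitted by both correlata; any other reading is not produced."""
    return sorted(LEFT[left] & RIGHT[right])


for pair in [("read his book", "Sunday"), ("was killed", "the river")]:
    print(pair, [BY_RELATIONS[c] for c in by_readings(*pair)])
```

With these assumed sets, "by Sunday" admits the temporal reading only, while "by the river" remains open to two readings, one of spatial proximity and one of agency.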

A sentence such as "I can easily read his book by Sunday" 
yields one interpretation only: 



[Graphic display not recoverable. Six product lines; surviving elements: correlators 0243E and 4010N; terminals I, CAN M4, EAS 6M, REA, HIS CP, BOO, BY, SUN.]

Note: Correlator 0243 is defined as 'Temporal Limitation'.






The sentence "Jones was killed by the river" yields
two interpretations:



[First graphic display not recoverable. Four product lines; surviving elements: correlators 0247E and 5010N.]

in which "by" is taken as correlator 0247, which is de-
fined as 'Efficient Agent', and



[Second graphic display not recoverable. Surviving terminals: JON, WAS, KIL VP, BY, THE CA, RIV.]

in which "by" is taken as correlator 0245, which is de-
fined as 'Spatial Proximity'.

Both parsings are correct, because the sentence is
genuinely ambiguous.

NON-SENTENCES: Syntactically Correct Interpretations
that Cannot Stand by Themselves

The syntactic interpretation of a string of words need 
not necessarily be the same when the string constitutes a 
sentence and when it constitutes only part of a sentence. 

For instance the string "the rabbit we let run", pre-
sented as a sentence, has to be interpreted as an invers-
ion of the sentence "we let the rabbit run"; if it is found
as part of the sentence "we kept the squirrel but the rab-
bit we let run", it must be interpreted in the same way;
but if it occurs as part of the sentence "they shot the
rabbit we let run", it must be recognised as a relative
clause.

Thus, although the interpretation as a relative clause 
is correct and, indeed, required under certain circum- 
stances, we can provide for it to be excluded as a sentence 
interpretation when the string stands by itself. 

This is implemented by a type of rule which marks cert-
ain 'complete' products as NON-SENTENCE. This comment is
printed beside the product and the corresponding record
prevents the product from being printed out as a graphic
display.

These 'product selection' rules are triggered by the
correlator number of the largest correlation in the 'com-
plete' product, and they can involve a number of different
conditions.

Some of these conditions are:

- a question mark terminating the string;

- the first correlatum of the largest correlation be-
ing a single word;

- the first correlatum of the correlation containing
the first word of the input sentence;

or a combination of these conditions.
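
A 'product selection' rule of this kind could be represented as sketched below. This is a hedged illustration in Python, not the form the rules take in the program: the class names, the convention that `None` means "condition not used", and the single sample rule for correlator 4210 are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompleteProduct:
    number: int
    top_correlator: str              # correlator of the largest correlation
    final_mark: str                  # "." or "?"
    first_is_single_word: bool       # first correlatum of the largest correlation
    first_contains_first_word: bool


@dataclass
class SelectionRule:
    correlator: str                                   # triggers the rule
    final_mark: Optional[str] = None                  # None: condition not used
    first_is_single_word: Optional[bool] = None
    first_contains_first_word: Optional[bool] = None

    def applies(self, p: CompleteProduct) -> bool:
        if p.top_correlator != self.correlator:
            return False
        pairs = [
            (self.final_mark, p.final_mark),
            (self.first_is_single_word, p.first_is_single_word),
            (self.first_contains_first_word, p.first_contains_first_word),
        ]
        return all(want is None or want == got for want, got in pairs)


# Hypothetical rule: a 4210 top correlation ending in "?" whose first
# correlatum is a single word cannot stand as a sentence by itself.
RULES = [SelectionRule("4210", final_mark="?", first_is_single_word=True)]


def is_non_sentence(p: CompleteProduct) -> bool:
    return any(rule.applies(p) for rule in RULES)


p9 = CompleteProduct(9, "4210", "?", True, True)
p7 = CompleteProduct(7, "3420", "?", True, True)
print(is_non_sentence(p9), is_non_sentence(p7))
```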
[...] is ambiguous because correla-
tor 3420 has not yet been [...] defined; it can mean:

a) [...]

b) will books be read? (e.g. at the meeting).

In spoken English this ambiguity is eliminated by stress.
We have no way of resolving it in written text (if the sent-
ence does not continue); but for translation, for instance
into Italian, disambiguation would be indispensable because
output for a) should be "libri sono da leggere?" while out-
put for b) would have to be "si leggeranno libri?".

The correlational possibilities of the input string, how-
ever, are not exhausted by product 0007. The system records
three more products, two of which are 'complete' but are
marked as 'non-sentences' (see page I-d,4).

Product 0009 is 'complete' but marked as 'non-sentence'
and therefore does not generate a structure display. Its
structure, however, can be determined by tracing its com-
position in the product list (see page I-d,3). The display
would look like this:

P 0009   +------4210N-------+
P 0008   :         +------3650N-------+
P 0006   :         :            +--2570N---+
P 0005   :         :        +-2310N-+      :
         ARE       BOOKS    TO      BE     READ ?



This is the 'predicative' construction (correlator No.
4210 in the top correlation), that would be the correct in-
terpretation in a sentence such as "all these are books to
be read".

Product 0010 is also 'complete' but marked as 'non-
sentence'. The display of its structure would look like
this:

P 0010   +------2020N-------+
P 0008   :         +------3650N-------+
P 0006   :         :            +--2570N---+
P 0005   :         :        +-2310N-+      :
         ARE       BOOKS    TO      BE     READ ?



This is the structure that would be correct if the in-
put string were the subject-auxiliary phrase (correlator
No. 2020 in the top correlation) in a question such as
"are books to be read necessarily boring?".

Note that both the above structures contain P 0008, the
noun phrase "books to be read", which is made by correlator
No. 3650 (defined as 'Purpose, expressed by infinitive').

Operative Vocabulary
Voc. N°  Word        S    G-code

0020     a           01   CA      indefinite article
0080     am          01   B1      auxiliary
0100     an          01   CA      indefinite article
0140     answer      01   VS      supine
                     02   $-      singular count-noun
0160     answered    01   VP      past participle
                     02   V8      past tense
0200     answers     01   V3      3rd person
                     02   (-      plural count-noun
0220     are         01   B2      auxiliary
0300     be          01   BS      aux. supine
                     02   B9      aux. subjunctive
0420     been        01   BP      aux. past participle
0440     being       01   BG      aux. pres. participle
                     02   $-      singular count-noun
0500     book        01   VS      supine
                     02   $-      singular count-noun
0580     books       01   V3      3rd person
                     02   (-      plural count-noun
0660     bright      01   1A      adjective
0800     by          01   0/      correlator (prep.)
                     02   OM      adverb
0900     can         01   M4      modal
                     02   VS      supine
                     03   $-      singular count-noun
0900     car         01   $-      singular count-noun
1060     Charles     01   $0      sing. non-count-noun
1200     day         01   $-      singular count-noun
1340     difficult   01   1F      framing adjective
1530     eager       01   1F      framing adjective
1500     easily      01   6M      descriptive adverb
1720     English     01   1A      adjective
                     02           plur. non-count-noun
2060     go          01   VS      supine
2100     going       01   VG      present participle
2140     gone        01   VP      past participle
                     02   1A      adjective
2240     had         01   H8      auxiliary, past t.
                     02   HP      aux. past participle
2340     has         01   H3      aux. 3rd person
2360     have        01   HS      aux. supine
2400     he          01   $1      personal pronoun
2500     her         01   A3      accusative pronoun
                     02   CP      possessive adjective
2620     him         01   A3      accusative pronoun
2660     his         01   CP      possessive adjective
                     02   +P      possessive pronoun
2700     I           01           personal pronoun
2900     in          01   0/      correlator (prep.)
                     02   OM      adverb
2920     is          01   B3      aux. 3rd person
2940     it          01   $3      personal pronoun
                     02   $9      impersonal pronoun
2990     Jones       01   $0      sing. non-count-noun
3070     killed      01   VP      past participle
3000     kind        01   1A      adjective
                     02   $-      singular count-noun
3260     knows       01   F3      framing v., 3rd person
3320     larger      01   2A      comparative adjective
3430     learn       01   FS      framing v., supine
3500     letters     01   (-      plural count-noun
3700     likely      01   1F      framing adjective
3780     little      01   CQ      determiner
                     02   $Q      pronoun
                     03   1A      adjective
                     04   6Q      adverb
4010     mad         01   1A      adjective
4220     me          01   A1      accusative pronoun
4460     my          01   CP      possessive adjective
4700     on          01   0/      correlator (prep.)
                     02   OM      adverb
4800     one         01   CN      numeral
                     02   $N      pronoun
                     03   $1      impersonal pronoun
                     04   $-      singular count-noun
5320     read        01   FS      framing v., supine
                     02   $-      singular count-noun
                     03   FP      framing past participle
                     04   F8      framing v., past tense
5340     reading     01   FG      framing pres. participle
                     02   $-      singular count-noun
5380     reads       01   F3      framing v., 3rd person
5390     red         01   1A      adjective
                     02   $0      sing. non-count-noun
5400     river       01   $-      singular count-noun
5420     sat         01   V8      past tense
5540     she         01   $1      personal pronoun
5750     speak       01   VS      supine
5790     Sunday      01   $-      singular count-noun
5900     the         01   CA      definite article
6100     they        01   (3      personal pronoun
6100     three       01   CN      numeral
                     02   $-      singular count-noun
6320     to          01   0/      correlator (prep.)
                     02   00      particle
6400     train       01   FS      framing v., supine
                     02   $-      singular count-noun
                     03   VS      supine (intransitive)
6420     trained     01   FP      framing past participle
                     02   F8      framing v., past tense
6460     trains      01   F3      framing v., 3rd person
                     02   (-      plural count-noun
                     03   V3      3rd person (intransitive)
6800     wanted      01   FP      framing past participle
                     02   F8      framing v., past tense
6860     was         01   B6      aux., past tense
6880     we          01   (1      personal pronoun
6900     went        01   V8      past tense
6920     were        01   B8      aux., past tense
                     02   B5      aux., subjunctive
7520     you         01   +2      personal pronoun
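
The way several senses (S-numbers) hang on one vocabulary item can be pictured with a small lookup table. The Python sketch below is an illustration only: the dictionary layout and the function names are assumptions, and the three entries are an excerpt of the list above, not the card format used by the program.

```python
# word -> list of (S-number, G-code, gloss), excerpted from the operative vocabulary
VOCABULARY = {
    "answers": [("01", "V3", "3rd person"), ("02", "(-", "plural count-noun")],
    "his": [("01", "CP", "possessive adjective"), ("02", "+P", "possessive pronoun")],
    "were": [("01", "B8", "aux., past tense"), ("02", "B5", "aux., subjunctive")],
}


def senses(word: str) -> list:
    """All senses recorded for a word; an empty list if it is not in the excerpt."""
    return VOCABULARY.get(word.lower(), [])


def is_ambiguous(word: str) -> bool:
    """A word with more than one S-item enters the procedure once per sense."""
    return len(senses(word)) > 1


for w in ["His", "answers", "wine"]:
    print(w, is_ambiguous(w), [code for _, code, _ in senses(w)])
```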



Note: This vocabulary contains the words with which we
are at present experimenting (with a view to further cor-
relation restraints and to the introduction of punctuation
marks); it in no way reflects the capacity of the pro-
gram. It is being kept at a minimum in order to avoid re-
petitious card punching when corrections and alterations
are being tested.